test: +1.69% coverage on src/runtime via 7 targeted test files #15
Merged
Targeted coverage uplift across the framework core's pure helpers and
two integration boundaries. Local src/runtime coverage rises from
87.21% to 88.90% (+1.69%); 97 previously-uncovered lines now exercised.
New test files (7):
* tests/test_llm_stub_structured_output.py
StubChatModel.with_structured_output happy path
(llm.py:141-160, 171-177). The defensive model_validate fallback
(lines 161-169) is excluded with a documented justification:
pydantic v2's model_validate calls __init__, so any schema whose
constructor raises also fails the fallback.
* tests/test_envelope_recovery.py
_try_recover_envelope_from_raw (graph.py:583-610). Table-driven
across the three candidate-substring strategies (raw, fenced,
greedy first-{...-last-}) and every failure path.
* tests/test_orchestrator_extract_last_error.py
Orchestrator._extract_last_error (orchestrator.py:945-998).
Pure mapping from a failed-AgentRun summary string to a
representative typed exception. Table-driven across
EnvelopeMissingError, ValidationError, TimeoutError,
OSError, RuntimeError fallback, plus reversed-iteration ordering.
* tests/test_handle_agent_failure.py
_handle_agent_failure (graph.py:613-644). Both the happy path
(reload + append + status='error') and the FileNotFoundError
fallback path (use caller's in-memory session). Plus a
partial-tool-write preservation regression test.
* tests/test_retry_session_locked_post_policy.py
Orchestrator._retry_session_locked post-policy execution path
(orchestrator.py:1552-1587). Stub orchestrator pulls in the
real method body and substitutes the surrounding integration
points (graph, finalize, pause). Covers the failed-AgentRun
filter, retry_count + active_thread_id pinning, and the
pause-vs-finalize fork.
* tests/test_service_run_exception_branches.py
OrchestratorService.start_session._run exception branches
(service.py:541-568). Three classes get distinct treatment:
CancelledError (propagate as-is), GraphInterrupt (propagate
WITHOUT marking registry status='error' -- HITL pause is not a
failure), generic Exception (mark error then propagate).
* tests/test_sse_tail_loop.py
SSE _stream tail-poll loop (api.py:879-890), including the
CancelledError re-raise from PR #13. Two tests: one drives the
tail to deliver a post-drain event, one forces sleep to raise
CancelledError and asserts it propagates (pinning the bug fix).
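
The three-way exception treatment described for `_run` can be illustrated with a minimal, self-contained sketch. Everything here is an assumption made for illustration: `GraphInterrupt` is a local stand-in class rather than the real import, and `Registry`, `run_guarded`, and `outcome` are hypothetical names, not the code in service.py.

```python
import asyncio


class GraphInterrupt(Exception):
    """Hypothetical stand-in for the HITL pause signal."""


class Registry:
    """Hypothetical in-memory registry: session id -> status string."""

    def __init__(self):
        self.status = {}

    def mark(self, session_id, status):
        self.status[session_id] = status


async def run_guarded(registry, session_id, work):
    """Run `work`, marking status='error' only for genuine failures."""
    registry.mark(session_id, "running")
    try:
        await work()
    except asyncio.CancelledError:
        # Cancellation is propagated as-is; status is left untouched.
        raise
    except GraphInterrupt:
        # A HITL pause is not a failure, so the status is NOT set to 'error'.
        raise
    except Exception:
        # Anything else is a real failure: record it, then propagate.
        registry.mark(session_id, "error")
        raise
    else:
        registry.mark(session_id, "done")


def outcome(exc):
    """Drive run_guarded with a coroutine that raises `exc`.

    Returns (name of the exception that escaped, final registry status).
    """
    registry = Registry()

    async def work():
        raise exc

    try:
        asyncio.run(run_guarded(registry, "s1", work))
    except BaseException as caught:  # CancelledError is a BaseException
        return type(caught).__name__, registry.status["s1"]
    return None, registry.status["s1"]
```

A test in this style asserts both halves of the contract for each branch: which exception escapes, and what the registry status is afterwards.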
Verification:
ruff check src/ tests/                            passed
pyright src/runtime                               0 errors / 0 warnings
pytest -x                                         1310 passed / 8 skipped (was 1265/8)
pytest --cov=src/runtime --cov-fail-under=85      88.90% (was 87.21%)
build_single_file.py                              dist unchanged (tests only)
Projected SonarCloud impact (after the Phase 1 PR #13 exclusion sync):
new_coverage ~87.2% -> ~89-90%
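
For readers unfamiliar with the table-driven style used in tests/test_envelope_recovery.py, the sketch below shows the general shape, using the three candidate-substring strategies named above (raw, fenced, greedy first-`{` to last-`}`). It is an assumption-laden illustration: `try_recover`, `candidates`, and `CASES` are hypothetical names, and the real `_try_recover_envelope_from_raw` may differ in its ordering, validation, and return type.

```python
import json
import re
from typing import Iterator, Optional

# Built at runtime so this example contains no literal code fence.
FENCE = "`" * 3
_FENCED = re.compile(
    re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE), re.DOTALL
)


def candidates(raw: str) -> Iterator[str]:
    """Yield candidate substrings in order: raw, fenced, greedy braces."""
    yield raw.strip()
    match = _FENCED.search(raw)
    if match:
        yield match.group(1).strip()
    start, end = raw.find("{"), raw.rfind("}")
    if start != -1 and end > start:
        yield raw[start : end + 1]


def try_recover(raw: str) -> Optional[dict]:
    """Return the first candidate that parses to a JSON object, else None."""
    for candidate in candidates(raw):
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    return None


# One row per strategy plus the failure paths: (input, expected result).
CASES = [
    ('{"a": 1}', {"a": 1}),                                      # raw
    ("Here:\n" + FENCE + 'json\n{"a": 2}\n' + FENCE, {"a": 2}),  # fenced
    ('noise {"a": {"b": 3}} trailing', {"a": {"b": 3}}),         # greedy
    ("no json here", None),                                      # no braces
    ("{broken", None),                                           # unbalanced
    ("[1, 2]", None),                                            # not an object
]

for raw, expected in CASES:
    assert try_recover(raw) == expected, raw
```

Keeping inputs and expectations in one table makes it cheap to add a row whenever a new malformed-output shape is observed.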
Summary
Phase 2 of the Sonar gate-green effort. PR #13 (now merged) flipped the gate via config + bug fix; this PR closes coverage gaps with new tests.
Coverage delta
| src/runtime/ lines | Test file | What's covered |
| --- | --- | --- |
| `llm.py:141-160, 171-177` | `test_llm_stub_structured_output.py` | `StubChatModel.with_structured_output` happy path |
| `graph.py:583-610` | `test_envelope_recovery.py` | `_try_recover_envelope_from_raw` (table-driven across raw / fenced / greedy strategies) |
| `orchestrator.py:945-998` | `test_orchestrator_extract_last_error.py` | `_extract_last_error` exception-class mapping (table-driven) |
| `graph.py:613-644` | `test_handle_agent_failure.py` | `FileNotFoundError` fallback + partial-tool-write preservation |
| `orchestrator.py:1552-1587` | `test_retry_session_locked_post_policy.py` | `_retry_session_locked` post-policy execution path (filter, thread-id pin, pause/finalize fork) |
| `service.py:541-568` | `test_service_run_exception_branches.py` | `_run` exception branches: `CancelledError` / `GraphInterrupt` / generic |
| `api.py` SSE tail loop | `test_sse_tail_loop.py` | `CancelledError` propagation (pins PR #13's behaviour) |

What's intentionally NOT covered
`StubChatModel.with_structured_output` permissive `model_validate` fallback (`llm.py:161-169`): pydantic v2's `model_validate` ultimately calls `__init__`, so any constructor-failing schema also fails the fallback. The branch exists for hypothetical schemas with custom `__pydantic_validator__` overrides; it can't be cleanly unit-tested without monkey-patching pydantic internals. Documented in the test file.

Test plan
* `uv run ruff check src/ tests/`: passed
* `uv run pyright src/runtime`: 0 errors / 0 warnings
* `uv run pytest -x`: 1,310 passed / 8 skipped
* `uv run pytest --cov=src/runtime --cov-fail-under=85 -x`: 88.90%
* No `src/runtime/` changes → `dist/*` unchanged → bundle staleness gate (HARD-08) is a no-op
* `new_coverage` rises further on the merged main

🤖 Generated with Claude Code